Hybrid Policy Learning for Multi-Agent Pathfinding
Abstract
In this work we study the behavior of groups of autonomous vehicles that are part of Internet of Vehicles systems. One challenging mode of operation for such systems is the case when the observability of each vehicle is limited and global/local communication is unstable, e.g. in crowded parking lots. In such scenarios the vehicles have to rely on local observations and exhibit cooperative behavior to ensure safe and efficient trips. This type of problem can be abstracted to so-called multi-agent pathfinding, in which a group of agents, confined to a graph, have to find collision-free paths to their goals (ideally, minimizing an objective function such as travel time). Widely used algorithms for solving this problem rely on the assumption that a central controller exists for which the full state of the environment (i.e. the agents' current positions, their targets, the configuration of static obstacles, etc.) is known, and they cannot be straightforwardly adapted to partially observable setups. To this end, we suggest a novel approach based on the decomposition of the problem into two sub-tasks: reaching the goal and avoiding collisions. To accomplish each task we utilize reinforcement learning methods such as Deep Monte Carlo Tree Search, Q-mixing networks, and policy gradients to design policies that map the agents' observations to actions. Next, we introduce a policy-mixing mechanism to end up with a single hybrid policy that allows each agent to exhibit both types of behavior: the individual one (reaching the goal) and the cooperative one (avoiding collisions with other agents). We conduct an extensive empirical evaluation that shows that the suggested hybrid policy outperforms standalone state-of-the-art reinforcement learning methods for this kind of problem by a notable margin.
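The abstract's notion of "collision-free paths" can be made concrete with a small checker. The sketch below is illustrative only and is not taken from the paper: it assumes grid cells as graph vertices, one move per time step, agents that wait on their last cell after arriving, and the two conflict types commonly used in the multi-agent pathfinding literature (two agents on the same cell, or two agents swapping cells). The names `find_conflict` and `sum_of_costs` are hypothetical.

```python
from itertools import combinations
from typing import List, Optional, Tuple

Cell = Tuple[int, int]
Path = List[Cell]  # path[t] is the agent's cell at time step t


def position_at(path: Path, t: int) -> Cell:
    """Assumption: an agent stays on its final cell once it has arrived."""
    return path[min(t, len(path) - 1)]


def find_conflict(paths: List[Path]) -> Optional[Tuple[int, int, int]]:
    """Return (agent_i, agent_j, t) for the first conflict found, else None."""
    horizon = max((len(p) for p in paths), default=0)
    for t in range(horizon):
        for i, j in combinations(range(len(paths)), 2):
            a_now, b_now = position_at(paths[i], t), position_at(paths[j], t)
            # Vertex conflict: both agents occupy the same cell at time t.
            if a_now == b_now:
                return (i, j, t)
            # Swap conflict: the agents exchange cells between t and t + 1.
            a_next = position_at(paths[i], t + 1)
            b_next = position_at(paths[j], t + 1)
            if a_next == b_now and b_next == a_now:
                return (i, j, t + 1)
    return None


def sum_of_costs(paths: List[Path]) -> int:
    """One possible objective: total number of time steps until each agent arrives."""
    return sum(len(p) - 1 for p in paths)


if __name__ == "__main__":
    head_on = [[(0, 0), (0, 1), (0, 2)], [(0, 2), (0, 1), (0, 0)]]
    parallel = [[(0, 0), (0, 1)], [(1, 0), (1, 1)]]
    print(find_conflict(head_on))   # the two agents meet on cell (0, 1)
    print(find_conflict(parallel))  # no conflict
    print(sum_of_costs(parallel))
```

A checker like this only validates a joint plan after the fact. In the partially observable setting the abstract describes, each agent would have to avoid such conflicts using its local observations alone.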
Related resources
A* Variants for Optimal Multi-Agent Pathfinding
Several variants of A* have been recently proposed for finding optimal solutions for the multi-agent pathfinding (MAPF) problem. However, these variants have not been deeply compared either quantitatively or qualitatively. In this paper we aim to fill this gap. In addition to obtaining a deeper understanding of the existing algorithms, we describe in detail the application of the new enhanced p...
Optimal Multi-Agent Pathfinding Algorithms
The multi-agent path finding (MAPF) problem is a generalization of the single-agent path finding problem for k > 1 agents. It consists of a graph and a number of agents. For each agent, a unique start state and a unique goal state are given, and the task is to find paths for all agents from their start states to their goal states, under the constraint that agents cannot collide during their mov...
Independence Detection for Multi-Agent Pathfinding Problems
Problems that require multiple agents to follow noninterfering paths from their current states to their respective goal states are called multi-agent pathfinding problems (MAPFs). In previous work, we presented Independence Detection (ID), an algorithm for breaking a large MAPF problem into smaller problems that can be solved independently. Independence Detection is complete and can be used in ...
Multi-Agent Learning with Policy Prediction
Due to the non-stationary environment, learning in multi-agent systems is a challenging problem. This paper first introduces a new gradient-based learning algorithm, augmenting the basic gradient ascent approach with policy prediction. We prove that this augmentation results in a stronger notion of convergence than the basic gradient ascent, that is, strategies converge to a Nash equilibrium wi...
Multi-Agent Pathfinding as a Combinatorial Auction
This paper proposes a mapping between multi-agent pathfinding (MAPF) and combinatorial auctions (CAs). In MAPF, agents need to reach their goal destinations without colliding. Algorithms for solving MAPF aim at assigning agents non-conflicting paths that minimize agents’ travel costs. In CA problems, agents bid over bundles of items they desire. Auction mechanisms aim at finding an allocation o...
Journal
Journal title: IEEE Access
Year: 2021
ISSN: 2169-3536
DOI: https://doi.org/10.1109/access.2021.3111321